Scalable Vector Processors for Embedded Systems
Authors
Christoforos E. Kozyrakis
Abstract
Designers of embedded processors have typically optimized for low power consumption and low design complexity to minimize cost. Performance was a secondary consideration. Nowadays, many embedded systems (set-top boxes, game consoles, personal digital assistants, and cell phones) commonly perform computation-intensive media tasks such as video processing, speech transcoding, graphics, and high-bandwidth telecommunications. Consequently, modern embedded processors must provide high performance in addition to low cost. They must also be easy to scale and customize to meet the rigorous time-to-market requirements for consumer electronic products. The conventional wisdom for high-performance embedded processors is to use the superscalar or very long instruction word (VLIW) paradigms developed for desktop computing. Both approaches exploit instruction-level parallelism (ILP) in applications to execute a few operations per cycle in parallel. Superscalar processors detect ILP dynamically in hardware, which increases power consumption and complexity. VLIW processors rely on the compiler to detect ILP, which increases code size. Both approaches are difficult to scale because they require either significant hardware redesign (superscalar) or instruction-set redefinition (VLIW). Furthermore, scaling up either of the two exacerbates its initial disadvantages. This article advocates an alternative approach to embedded processors that provides high performance for critical tasks without sacrificing power efficiency or design simplicity. The key observation is that multimedia and telecommunications tasks contain large amounts of data-level parallelism (DLP). Hence, it’s not surprising that we revisit vector architectures, the paradigm developed for high performance with the large-scale DLP available in scientific computations.
Just as superscalar and VLIW processors for desktop systems adjusted to accommodate embedded designs, we can revise vector architectures for supercomputers to serve in embedded applications. To demonstrate that vector architectures meet the requirements of embedded media processing, we evaluate the Vector IRAM, or VIRAM (pronounced “V-IRAM”), architecture developed at UC Berkeley, using benchmarks from the Embedded Microprocessor...
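The data-level parallelism the abstract refers to can be illustrated with a small simulation. The sketch below is not taken from the article or from the VIRAM instruction set: the function names, the pixel data, and the `max_vector_length` parameter are all hypothetical. It applies a saturating 8-bit add (a common media operation) in strips, where each strip stands in for one vector instruction, and shows that the same loop yields the same result whether the assumed hardware handles 4 or 8 elements per instruction.

```python
# Hypothetical illustration of data-level parallelism via strip-mining.
# Names and parameters are assumptions, not part of the VIRAM ISA.

def saturating_add_u8(a, b):
    """Scalar reference: add two 8-bit values, clamping the sum at 255."""
    return min(a + b, 255)


def vector_add_strip_mined(xs, ys, max_vector_length):
    """Add two equal-length lists in strips of at most max_vector_length.

    Each strip models one vector instruction. Elements within a strip are
    independent, so real hardware could spread them across any number of
    parallel lanes without changing the program.
    Returns the result list and the number of strips processed.
    """
    assert len(xs) == len(ys)
    out = []
    vector_instructions = 0
    for start in range(0, len(xs), max_vector_length):
        stop = min(start + max_vector_length, len(xs))
        out.extend(
            saturating_add_u8(a, b)
            for a, b in zip(xs[start:stop], ys[start:stop])
        )
        vector_instructions += 1
    return out, vector_instructions


pixels_a = [10, 200, 250, 0, 128, 255, 7, 90, 33, 100]
pixels_b = [5, 100, 10, 0, 128, 1, 7, 200, 33, 155]

# Same code, two assumed hardware vector lengths.
short_result, short_count = vector_add_strip_mined(pixels_a, pixels_b, 4)
long_result, long_count = vector_add_strip_mined(pixels_a, pixels_b, 8)
```

With the longer assumed vector length, the same ten elements need fewer modeled vector instructions while the output stays identical, which is the kind of scaling without instruction-set changes that the abstract contrasts with superscalar and VLIW designs.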
Similar resources
For Embedded Applications with Data-Level Parallelism, a Vector Processor Offers High Performance at Low Power Consumption and Low Design Complexity. Unlike Superscalar and VLIW Designs, a Vector Processor Is Scalable and Can Optimally Match Specific
Designers of embedded processors have typically optimized for low power consumption and low design complexity to minimize cost. Performance was a secondary consideration. Nowadays, many embedded systems (set-top boxes, game consoles, personal digital assistants, and cell phones) commonly perform computation-intensive media tasks such as video processing, speech transcoding, graphics, and high-b...
A Media-Enhanced Vector Architecture for Embedded Memory Systems
Next generation portable devices will require processors with both low energy consumption and high performance for media functions. At the same time, modern CMOS technology creates the need for highly scalable VLSI architectures. Conventional processor architectures fail to meet these requirements. This paper presents the architecture of Vector IRAM (VIRAM), a processor that combines vector pro...
MPI for Embedded Systems: A Case Study
Distributed embedded processors are fast becoming the central stage in the architecture of embedded systems. With multiple processors, a distributed embedded system is more scalable towards either high performance or low power. The reduced workload on each processor creates new opportunities for dynamic voltage scaling (DVS); meanwhile the performance can be compensated by increased parallelism...
A scalable single-chip multi-processor architecture with on-chip RTOS kernel
Now that system-on-chip technology is emerging, single-chip multi-processors are becoming feasible. A key problem of designing such systems is the complexity of their on-chip interconnects and memory architecture. It is furthermore unclear at what level software should be integrated. An example of a single-chip multi-processor for real-time (networked) embedded systems is the multi-microprocess...
Vector microprocessors for cryptography
Embedded security devices like ‘Trusted Platforms’ require both scalability (of power, performance and area) and flexibility (of software and countermeasures). This thesis illustrates how data parallel techniques can be used to implement scalable architectures for cryptography. Vector processing is used to provide high performance, power efficient and scalable processors. A programmable vector ...
Journal title:
- IEEE Micro
Volume 23, issue
Pages -
Publication date 2003